Construction of the Gmane corpus for examining the diffusion of lexical innovations

Authors

  • Kyle Marek-Spartz
  • Paula Chesley
  • Hannah Sande
Abstract

Large-scale linguistic corpora, complete with information about speakers’ social networks as well as demographic and temporal information, allow for empirical validation of complex theories about the social interactions and linguistic properties leading to large-scale language change. We present ongoing work on the diffusion of lexical innovations using a corpus we have compiled from the Gmane electronic mailing list archive, a publicly available dataset of 13,494 mailing lists and 117,606,370 messages to date. Focusing initially on a single list, we derive a social network for actor-speakers, give lexical and network statistics, and empirically categorize tie strength across speakers. Initial explorations of the Gmane corpus suggest suitability for research on language change.
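As an illustration of how a social network might be derived from a single mailing list, the following minimal Python sketch counts directed reply ties from raw message headers. It is not the authors' method: the abstract does not say how ties are defined, so the use of the standard `Message-ID` and `In-Reply-To` headers, the function names `reply_edges` and `categorize_ties`, and the `strong_threshold` cutoff are all assumptions made for this example.

```python
from collections import Counter
from email import message_from_string
from email.utils import parseaddr


def reply_edges(raw_messages):
    """Count directed reply ties (replier -> author of the parent message).

    Assumption: a tie exists when one message's In-Reply-To header points
    at another message's Message-ID. The paper may define ties differently.
    """
    parsed = [message_from_string(raw) for raw in raw_messages]

    # Map each Message-ID to the (lowercased) address of its sender.
    author_by_id = {}
    for msg in parsed:
        msg_id = (msg["Message-ID"] or "").strip()
        if msg_id:
            author_by_id[msg_id] = parseaddr(msg["From"] or "")[1].lower()

    # Count one edge per reply whose parent is present in the same list.
    edges = Counter()
    for msg in parsed:
        parent_id = (msg["In-Reply-To"] or "").strip()
        replier = parseaddr(msg["From"] or "")[1].lower()
        target = author_by_id.get(parent_id)
        if replier and target and replier != target:
            edges[(replier, target)] += 1
    return edges


def categorize_ties(edges, strong_threshold=2):
    """Label each tie 'strong' or 'weak' by reply count.

    The threshold is a hypothetical cutoff chosen only for illustration.
    """
    return {
        pair: ("strong" if count >= strong_threshold else "weak")
        for pair, count in edges.items()
    }
```

Counting replies per ordered pair keeps the network directed, so asymmetric exchanges (one speaker replying often, the other rarely) remain visible when tie strength is examined.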

Similar resources

Examining the Effect of Ideology and Idiosyncrasy on Lexical Choices in Translation Studies within the CDA Framework

Using a critical discourse analytic model of translation criticism, the present study attempts to explore the effect of ideology and idiosyncrasy on the lexical choices in translation studies. The study employed a descriptive approach to answer two research questions: Is there any relationship between ideology and idiosyncratic features of translators' lexical choices? And if yes, can it be ana...

A Corpus-based Study of Lexical Bundles in Discussion Section of Medical Research Articles

There has been increasing interest in utilizing corpora in linguistic research and pedagogy in recent years. Rhetorical organization of different sections of research articles may appear similar in various disciplines, but close examination may show subtle differences nonetheless. One of the features that has been at the center of attention especially in recent years is the idiomaticity of a di...

A Corpus-Based Study of the Lexical Make-up of Applied Linguistics Article Abstracts

This paper reports results from a corpus-based study that explored the frequency of words in the abstracts of applied linguistics journal articles. The abstracts of major articles in leading applied linguistics journals, published since 2005 up to November 2001 were analyzed using software modules from the Compleat Lexical Tutor. The output includes a list of the most frequent content words, list...

The Impact of Teaching Corpus-based Collocation on EFL Learners' Writing Ability

The present study explores the impact of corpus-based collocation instruction on intermediate Iranian EFL learners' writing ability. For this study, 84 Iranian learners, studying English as a foreign language in Bayan Institute, Iran, were selected and were randomly divided into two groups, experimental and control. Conventional methods of writing instruction were taught to the control...


Publication date: 2012